
Recipes to evaluate CORDEX-CMIP5 models #4199

Draft
sloosvel wants to merge 25 commits into main from dev_cordex5_recipe


@sloosvel sloosvel commented Sep 15, 2025

Contributor

Description

This PR contains test recipes that produce the plots for WP2 of ESO4clima. It is meant to be informational; feel free to open another PR to merge the end product.

recipe_cordex-cmip5.yml

recipe_cordex-cmip5.yml plots maps for the CORDEX data and, if an observational dataset is present, computes the bias against it. The maps are useful to visually inspect the data and detect issues in the values or the grid. Some example plots that can be produced are the following:

Screenshot from 2025-07-18 09-30-06 map_ts_CORDEX_CNRM-CERFACS-CNRM-CM5_ALADIN63_day
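
As a rough illustration of the bias step described above, the sketch below subtracts a reference field from a model field cell by cell and averages the result. It is a toy in plain Python, not the recipe's actual preprocessing: the function names `bias_map` and `mean_bias` are hypothetical, the numbers are made up, and both fields are assumed to already share one grid.

```python
from statistics import fmean


def bias_map(model, reference):
    """Return the cell-by-cell difference model - reference.

    Cells where either field is missing (None) stay missing.
    """
    return [
        [
            None if m is None or r is None else m - r
            for m, r in zip(model_row, ref_row)
        ]
        for model_row, ref_row in zip(model, reference)
    ]


def mean_bias(bias):
    """Average the bias over all non-missing cells (unweighted)."""
    values = [v for row in bias for v in row if v is not None]
    return fmean(values)


# Toy 2x3 surface temperature fields in kelvin (made-up numbers).
model = [[280.0, 282.0, 284.0], [279.0, 281.0, None]]
obs = [[279.0, 282.5, 283.0], [278.0, None, 285.0]]

bias = bias_map(model, obs)
print(bias)
print(mean_bias(bias))
```

Missing cells are propagated rather than filled, so a gap in either dataset shows up as a gap in the bias map instead of a spurious value.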

recipe_cordex-cmip5_perfmetrics.yml

Produces the performance metrics plot for the variable ts, which was first examined with the map recipe above to check that the data is consistent. The recipe still needs to be expanded to support more variables and more CORDEX datasets. It contains the full list of CORDEX datasets, commented out; these should be uncommented once all CORDEX metadata and data issues are fixed in ESMValCore. For now, a sample list of datasets produces a test performance plot for a single variable against two observational datasets, just to show that nothing prevents the diagnostic from running. It produces the following plot:

performance_ts
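
For readers unfamiliar with what a performance metrics plot summarizes: each cell is a distance between a model field and a reference field. The script later in this conversation configures `distance_metric` with `weighted_rmse`. The sketch below shows the general idea of a weighted RMSE in plain Python, assuming cos(latitude) as a stand-in for grid-cell area. It is not ESMValCore's implementation, and on the rotated EUR-11 grid the true cell areas would be needed.

```python
import math


def weighted_rmse(model, reference, latitudes):
    """Root-mean-square error with cos(latitude) weights.

    `model` and `reference` are lists of rows, one row per latitude
    in `latitudes` (degrees north).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for model_row, ref_row, lat in zip(model, reference, latitudes):
        weight = math.cos(math.radians(lat))
        for m, r in zip(model_row, ref_row):
            weighted_sum += weight * (m - r) ** 2
            weight_total += weight
    return math.sqrt(weighted_sum / weight_total)


# A constant offset of 2 gives an RMSE of 2, whatever the weights are.
model = [[1.0, 2.0], [3.0, 4.0]]
reference = [[3.0, 4.0], [5.0, 6.0]]
print(weighted_rmse(model, reference, latitudes=[40.0, 60.0]))
```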

recipe_cordex-cmip5_perfmetrics_all.yml

Produces the performance metrics plot for several variables in CORDEX and ESA-CCI data. The CORDEX datasets are the subset of the EUR-11 domain that is common to all variables. However, the data has not been visually inspected beforehand with recipe_cordex-cmip5.yml, so there may be data issues that have gone unnoticed. It produces the following plot:

performance_all

Before you get started

Checklist

It is the responsibility of the author to make sure the pull request is ready to review. The icons indicate whether the item will be subject to the 🛠 Technical or 🧪 Scientific review.

New or updated recipe/diagnostic

New or updated data reformatting script


@sloosvel sloosvel changed the title Recipe to evaluate CORDEX-CMIP5 models Recipes to evaluate CORDEX-CMIP5 models Nov 26, 2025
@bouweandela

bouweandela commented Apr 30, 2026

Member

To generate a recipe with all available datasets for the variable ts, run this script:

import yaml

from esmvalcore.config import CFG
from esmvalcore.dataset import Dataset, datasets_to_recipe


def main() -> None:
    """Make a recipe with all CORDEX data we can find."""
    # Make sure to configure the data sources before running this script.
    # To configure ESGF as a data source, run:
    # `esmvaltool config copy data-esmvalcore-esgf.yml`
    CFG["search_data"] = "complete"

    template = Dataset(
        project="CORDEX",
        mip="day",
        short_name="ts",
        domain="EUR-11",
        dataset="*",
        institute="*",
        rcm_version="*",
        driver="*",
        ensemble="*",
    )
    datasets = tuple(template.from_files())
    print(
        yaml.safe_dump(
            datasets_to_recipe(
                [d.copy(diagnostic="diagnostic") for d in datasets],
            ),
        ),
    )


if __name__ == "__main__":
    main()

Result:

Details

datasets:
- dataset: ALADIN53
  driver: CNRM-CERFACS-CNRM-CM5
  ensemble: r1i1p1
  institute: CNRM
  rcm_version: v1
- dataset: ALADIN53
  driver: ECMWF-ERAINT
  ensemble: r1i1p1
  institute: CNRM
  rcm_version: v1
- dataset: ALADIN63
  driver: CNRM-CERFACS-CNRM-CM5
  ensemble: r1i1p1
  institute: CNRM
  rcm_version: v2
- dataset: ALADIN63
  driver: ECMWF-ERAINT
  ensemble: r1i1p1
  institute: CNRM
  rcm_version: v1
- dataset: ALADIN63
  driver: MOHC-HadGEM2-ES
  ensemble: r1i1p1
  institute: CNRM
  rcm_version: v1
- dataset: ALADIN63
  driver: MPI-M-MPI-ESM-LR
  ensemble: r1i1p1
  institute: CNRM
  rcm_version: v1
- dataset: ALADIN63
  driver: NCC-NorESM1-M
  ensemble: r1i1p1
  institute: CNRM
  rcm_version: v1
- dataset: CCLM4-8-17
  driver: CNRM-CERFACS-CNRM-CM5
  ensemble: r1i1p1
  institute: CLMcom
  rcm_version: v1
- dataset: CCLM4-8-17
  driver: ECMWF-ERAINT
  ensemble: r1i1p1
  institute: CLMcom
  rcm_version: v1
- dataset: CCLM4-8-17
  driver: ICHEC-EC-EARTH
  ensemble: r12i1p1
  institute: CLMcom
  rcm_version: v1
- dataset: CCLM4-8-17
  driver: MIROC-MIROC5
  ensemble: r1i1p1
  institute: CLMcom
  rcm_version: v1
- dataset: CCLM4-8-17
  driver: MOHC-HadGEM2-ES
  ensemble: r1i1p1
  institute: CLMcom
  rcm_version: v1
- dataset: CCLM4-8-17
  driver: MPI-M-MPI-ESM-LR
  ensemble: r1i1p1
  institute: CLMcom
  rcm_version: v1
- dataset: COSMO-crCLIM-v1-1
  driver: CNRM-CERFACS-CNRM-CM5
  ensemble: r1i1p1
  institute: CLMcom-ETH
  rcm_version: v1
- dataset: COSMO-crCLIM-v1-1
  driver: ECMWF-ERAINT
  ensemble: r1i1p1
  institute: CLMcom-ETH
  rcm_version: v1
- dataset: COSMO-crCLIM-v1-1
  driver: ICHEC-EC-EARTH
  ensemble: r1i1p1
  institute: CLMcom-ETH
  rcm_version: v1
- dataset: COSMO-crCLIM-v1-1
  driver: ICHEC-EC-EARTH
  ensemble: r3i1p1
  institute: CLMcom-ETH
  rcm_version: v1
- dataset: COSMO-crCLIM-v1-1
  driver: ICHEC-EC-EARTH
  ensemble: r12i1p1
  institute: CLMcom-ETH
  rcm_version: v1
- dataset: COSMO-crCLIM-v1-1
  driver: MOHC-HadGEM2-ES
  ensemble: r1i1p1
  institute: CLMcom-ETH
  rcm_version: v1
- dataset: COSMO-crCLIM-v1-1
  driver: MPI-M-MPI-ESM-LR
  ensemble: r(1:3)i1p1
  institute: CLMcom-ETH
  rcm_version: v1
- dataset: COSMO-crCLIM-v1-1
  driver: NCC-NorESM1-M
  ensemble: r1i1p1
  institute: CLMcom-ETH
  rcm_version: v1
- dataset: HIRHAM5
  driver: CNRM-CERFACS-CNRM-CM5
  ensemble: r1i1p1
  institute: DMI
  rcm_version: v2
- dataset: HIRHAM5
  driver: ECMWF-ERAINT
  ensemble: r1i1p1
  institute: DMI
  rcm_version: v1
- dataset: HIRHAM5
  driver: ICHEC-EC-EARTH
  ensemble: r1i1p1
  institute: DMI
  rcm_version: v1
- dataset: HIRHAM5
  driver: ICHEC-EC-EARTH
  ensemble: r12i1p1
  institute: DMI
  rcm_version: v1
- dataset: HIRHAM5
  driver: ICHEC-EC-EARTH
  ensemble: r3i1p1
  institute: DMI
  rcm_version: v2
- dataset: HIRHAM5
  driver: IPSL-IPSL-CM5A-MR
  ensemble: r1i1p1
  institute: DMI
  rcm_version: v1
- dataset: HIRHAM5
  driver: MOHC-HadGEM2-ES
  ensemble: r1i1p1
  institute: DMI
  rcm_version: v2
- dataset: HIRHAM5
  driver: MPI-M-MPI-ESM-LR
  ensemble: r1i1p1
  institute: DMI
  rcm_version: v1
- dataset: HIRHAM5
  driver: NCC-NorESM1-M
  ensemble: r1i1p1
  institute: DMI
  rcm_version: v3
- dataset: HadREM3-GA7-05
  driver: CNRM-CERFACS-CNRM-CM5
  ensemble: r1i1p1
  institute: MOHC
  rcm_version: v2
- dataset: HadREM3-GA7-05
  driver: ECMWF-ERAINT
  ensemble: r1i1p1
  institute: MOHC
  rcm_version: v1
- dataset: HadREM3-GA7-05
  driver: ICHEC-EC-EARTH
  ensemble: r12i1p1
  institute: MOHC
  rcm_version: v1
- dataset: HadREM3-GA7-05
  driver: MOHC-HadGEM2-ES
  ensemble: r1i1p1
  institute: MOHC
  rcm_version: v1
- dataset: HadREM3-GA7-05
  driver: MPI-M-MPI-ESM-LR
  ensemble: r1i1p1
  institute: MOHC
  rcm_version: v1
- dataset: HadREM3-GA7-05
  driver: NCC-NorESM1-M
  ensemble: r1i1p1
  institute: MOHC
  rcm_version: v1
- dataset: RACMO22E
  driver: CNRM-CERFACS-CNRM-CM5
  ensemble: r1i1p1
  institute: KNMI
  rcm_version: v2
- dataset: RACMO22E
  driver: ECMWF-ERAINT
  ensemble: r1i1p1
  institute: KNMI
  rcm_version: v1
- dataset: RACMO22E
  driver: ICHEC-EC-EARTH
  ensemble: r1i1p1
  institute: KNMI
  rcm_version: v1
- dataset: RACMO22E
  driver: ICHEC-EC-EARTH
  ensemble: r3i1p1
  institute: KNMI
  rcm_version: v1
- dataset: RACMO22E
  driver: ICHEC-EC-EARTH
  ensemble: r12i1p1
  institute: KNMI
  rcm_version: v1
- dataset: RACMO22E
  driver: IPSL-IPSL-CM5A-MR
  ensemble: r1i1p1
  institute: KNMI
  rcm_version: v1
- dataset: RACMO22E
  driver: MOHC-HadGEM2-ES
  ensemble: r1i1p1
  institute: KNMI
  rcm_version: v2
- dataset: RACMO22E
  driver: MPI-M-MPI-ESM-LR
  ensemble: r1i1p1
  institute: KNMI
  rcm_version: v1
- dataset: RACMO22E
  driver: NCC-NorESM1-M
  ensemble: r1i1p1
  institute: KNMI
  rcm_version: v1
- dataset: RCA4
  driver: CNRM-CERFACS-CNRM-CM5
  ensemble: r1i1p1
  institute: SMHI
  rcm_version: v1
- dataset: RCA4
  driver: ECMWF-ERAINT
  ensemble: r1i1p1
  institute: SMHI
  rcm_version: v1
- dataset: RCA4
  driver: ICHEC-EC-EARTH
  ensemble: r1i1p1
  institute: SMHI
  rcm_version: v1
- dataset: RCA4
  driver: ICHEC-EC-EARTH
  ensemble: r3i1p1
  institute: SMHI
  rcm_version: v1
- dataset: RCA4
  driver: ICHEC-EC-EARTH
  ensemble: r12i1p1
  institute: SMHI
  rcm_version: v1
- dataset: RCA4
  driver: IPSL-IPSL-CM5A-MR
  ensemble: r1i1p1
  institute: SMHI
  rcm_version: v1
- dataset: RCA4
  driver: MOHC-HadGEM2-ES
  ensemble: r1i1p1
  institute: SMHI
  rcm_version: v1
- dataset: RCA4
  driver: MPI-M-MPI-ESM-LR
  ensemble: r(2:3)i1p1
  institute: SMHI
  rcm_version: v1
- dataset: RCA4
  driver: MPI-M-MPI-ESM-LR
  ensemble: r1i1p1
  institute: SMHI
  rcm_version: v1a
- dataset: RCA4
  driver: NCC-NorESM1-M
  ensemble: r1i1p1
  institute: SMHI
  rcm_version: v1
- dataset: REMO2009
  driver: ECMWF-ERAINT
  ensemble: r1i1p1
  institute: MPI-CSC
  rcm_version: v1
- dataset: REMO2009
  driver: MPI-M-MPI-ESM-LR
  ensemble: r(1:2)i1p1
  institute: MPI-CSC
  rcm_version: v1
- dataset: REMO2015
  driver: CNRM-CERFACS-CNRM-CM5
  ensemble: r1i1p1
  institute: GERICS
  rcm_version: v2
- dataset: REMO2015
  driver: ECMWF-ERAINT
  ensemble: r1i1p1
  institute: GERICS
  rcm_version: v1
- dataset: REMO2015
  driver: ICHEC-EC-EARTH
  ensemble: r12i1p1
  institute: GERICS
  rcm_version: v1
- dataset: REMO2015
  driver: IPSL-IPSL-CM5A-LR
  ensemble: r1i1p1
  institute: GERICS
  rcm_version: v1
- dataset: REMO2015
  driver: IPSL-IPSL-CM5A-MR
  ensemble: r1i1p1
  institute: GERICS
  rcm_version: v1
- dataset: REMO2015
  driver: MIROC-MIROC5
  ensemble: r1i1p1
  institute: GERICS
  rcm_version: v1
- dataset: REMO2015
  driver: MOHC-HadGEM2-ES
  ensemble: r1i1p1
  institute: GERICS
  rcm_version: v1
- dataset: REMO2015
  driver: MPI-M-MPI-ESM-LR
  ensemble: r3i1p1
  institute: GERICS
  rcm_version: v1
- dataset: REMO2015
  driver: NCC-NorESM1-M
  ensemble: r1i1p1
  institute: GERICS
  rcm_version: v1
- dataset: REMO2015
  driver: NOAA-GFDL-GFDL-ESM2G
  ensemble: r1i1p1
  institute: GERICS
  rcm_version: v1
- dataset: RegCM4-2
  driver: ECMWF-ERAINT
  ensemble: r1i1p1
  institute: DHMZ
  rcm_version: v1
- dataset: RegCM4-6
  driver: CNRM-CERFACS-CNRM-CM5
  ensemble: r1i1p1
  institute: ICTP
  rcm_version: v2
- dataset: RegCM4-6
  driver: ECMWF-ERAINT
  ensemble: r1i1p1
  institute: ICTP
  rcm_version: v1
- dataset: RegCM4-6
  driver: ICHEC-EC-EARTH
  ensemble: r12i1p1
  institute: ICTP
  rcm_version: v1
- dataset: RegCM4-6
  driver: MOHC-HadGEM2-ES
  ensemble: r1i1p1
  institute: ICTP
  rcm_version: v1
- dataset: RegCM4-6
  driver: MPI-M-MPI-ESM-LR
  ensemble: r1i1p1
  institute: ICTP
  rcm_version: v1
- dataset: RegCM4-6
  driver: NCC-NorESM1-M
  ensemble: r1i1p1
  institute: ICTP
  rcm_version: v1
- dataset: WRF361H
  driver: MPI-M-MPI-ESM-LR
  ensemble: r1i1p1
  institute: UHOH
  rcm_version: v1
- dataset: WRF381P
  driver: CNRM-CERFACS-CNRM-CM5
  ensemble: r1i1p1
  institute: IPSL
  rcm_version: v2
- dataset: WRF381P
  driver: ECMWF-ERAINT
  ensemble: r1i1p1
  institute: IPSL
  rcm_version: v1
- dataset: WRF381P
  driver: ICHEC-EC-EARTH
  ensemble: r12i1p1
  institute: IPSL
  rcm_version: v1
- dataset: WRF381P
  driver: IPSL-IPSL-CM5A-MR
  ensemble: r1i1p1
  institute: IPSL
  rcm_version: v1
- dataset: WRF381P
  driver: MOHC-HadGEM2-ES
  ensemble: r1i1p1
  institute: IPSL
  rcm_version: v1
- dataset: WRF381P
  driver: MPI-M-MPI-ESM-LR
  ensemble: r1i1p1
  institute: IPSL
  rcm_version: v1
- dataset: WRF381P
  driver: NCC-NorESM1-M
  ensemble: r1i1p1
  institute: IPSL
  rcm_version: v1
diagnostics:
  diagnostic:
    variables:
      ts:
        domain: EUR-11
        mip: day
        project: CORDEX
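
A note on reading the result: entries such as `ensemble: r(1:3)i1p1` use the recipe range notation, which stands for the members r1i1p1, r2i1p1 and r3i1p1. The helper below is a hypothetical illustration of that expansion in plain Python, not the code ESMValCore uses internally.

```python
import re


def expand_ensemble(ensemble):
    """Expand range notation, e.g. 'r(1:3)i1p1', into member names."""
    match = re.search(r"\((\d+):(\d+)\)", ensemble)
    if match is None:
        # No range left to expand.
        return [ensemble]
    start, end = int(match.group(1)), int(match.group(2))
    expanded = []
    for number in range(start, end + 1):
        candidate = (
            ensemble[: match.start()] + str(number) + ensemble[match.end() :]
        )
        # Recurse in case the string contains a further range.
        expanded.extend(expand_ensemble(candidate))
    return expanded


print(expand_ensemble("r(1:3)i1p1"))
print(expand_ensemble("r12i1p1"))
```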

@bouweandela

Member

And here is another version that only lists those datasets that also have sftlf available:

import yaml

from esmvalcore.config import CFG
from esmvalcore.dataset import Dataset, datasets_to_recipe


def main() -> None:
    """Make a recipe with all CORDEX data that also has sftlf available."""
    # Make sure to configure the data sources before running this script.
    # To configure ESGF as a data source, run:
    # `esmvaltool config copy data-esmvalcore-esgf.yml`
    CFG["search_data"] = "complete"

    template = Dataset(
        project="CORDEX",
        mip="day",
        short_name="ts",
        domain="EUR-11",
        dataset="*",
        institute="*",
        rcm_version="*",
        driver="*",
        ensemble="*",
    )
    datasets = []
    for dataset in template.from_files():
        dataset.add_supplementary(short_name="sftlf", mip="fx", ensemble="*")
        result = next(dataset.from_files())
        if result.supplementaries:
            datasets.append(result)
    recipe = datasets_to_recipe(
        [d.copy(diagnostic="diagnostic") for d in datasets],
    )
    for dataset in recipe["datasets"]:
        dataset["supplementary_variables"][0].pop("diagnostic", None)
    print(yaml.safe_dump(recipe))


if __name__ == "__main__":
    main()

Result:

Details

datasets:
- dataset: ALADIN53
  driver: CNRM-CERFACS-CNRM-CM5
  ensemble: r1i1p1
  institute: CNRM
  rcm_version: v1
  supplementary_variables:
  - mip: fx
    short_name: sftlf
- dataset: ALADIN53
  driver: ECMWF-ERAINT
  ensemble: r1i1p1
  institute: CNRM
  rcm_version: v1
  supplementary_variables:
  - mip: fx
    short_name: sftlf
- dataset: ALADIN63
  driver: CNRM-CERFACS-CNRM-CM5
  ensemble: r1i1p1
  institute: CNRM
  rcm_version: v2
  supplementary_variables:
  - mip: fx
    short_name: sftlf
- dataset: ALADIN63
  driver: ECMWF-ERAINT
  ensemble: r1i1p1
  institute: CNRM
  rcm_version: v1
  supplementary_variables:
  - mip: fx
    short_name: sftlf
- dataset: ALADIN63
  driver: MOHC-HadGEM2-ES
  ensemble: r1i1p1
  institute: CNRM
  rcm_version: v1
  supplementary_variables:
  - mip: fx
    short_name: sftlf
- dataset: ALADIN63
  driver: MPI-M-MPI-ESM-LR
  ensemble: r1i1p1
  institute: CNRM
  rcm_version: v1
  supplementary_variables:
  - mip: fx
    short_name: sftlf
- dataset: ALADIN63
  driver: NCC-NorESM1-M
  ensemble: r1i1p1
  institute: CNRM
  rcm_version: v1
  supplementary_variables:
  - mip: fx
    short_name: sftlf
- dataset: CCLM4-8-17
  driver: MIROC-MIROC5
  ensemble: r1i1p1
  institute: CLMcom
  rcm_version: v1
  supplementary_variables:
  - ensemble: r0i0p0
    mip: fx
    short_name: sftlf
- dataset: COSMO-crCLIM-v1-1
  driver: CNRM-CERFACS-CNRM-CM5
  ensemble: r1i1p1
  institute: CLMcom-ETH
  rcm_version: v1
  supplementary_variables:
  - ensemble: r0i0p0
    mip: fx
    short_name: sftlf
- dataset: COSMO-crCLIM-v1-1
  driver: ECMWF-ERAINT
  ensemble: r1i1p1
  institute: CLMcom-ETH
  rcm_version: v1
  supplementary_variables:
  - ensemble: r0i0p0
    mip: fx
    short_name: sftlf
- dataset: COSMO-crCLIM-v1-1
  driver: ICHEC-EC-EARTH
  ensemble: r1i1p1
  institute: CLMcom-ETH
  rcm_version: v1
  supplementary_variables: &id001
  - ensemble: r0i0p0
    mip: fx
    short_name: sftlf
- dataset: COSMO-crCLIM-v1-1
  driver: ICHEC-EC-EARTH
  ensemble: r3i1p1
  institute: CLMcom-ETH
  rcm_version: v1
  supplementary_variables: *id001
- dataset: COSMO-crCLIM-v1-1
  driver: ICHEC-EC-EARTH
  ensemble: r12i1p1
  institute: CLMcom-ETH
  rcm_version: v1
  supplementary_variables: *id001
- dataset: COSMO-crCLIM-v1-1
  driver: MOHC-HadGEM2-ES
  ensemble: r1i1p1
  institute: CLMcom-ETH
  rcm_version: v1
  supplementary_variables:
  - ensemble: r0i0p0
    mip: fx
    short_name: sftlf
- dataset: COSMO-crCLIM-v1-1
  driver: MPI-M-MPI-ESM-LR
  ensemble: r(1:3)i1p1
  institute: CLMcom-ETH
  rcm_version: v1
  supplementary_variables:
  - ensemble: r0i0p0
    mip: fx
    short_name: sftlf
- dataset: COSMO-crCLIM-v1-1
  driver: NCC-NorESM1-M
  ensemble: r1i1p1
  institute: CLMcom-ETH
  rcm_version: v1
  supplementary_variables:
  - ensemble: r0i0p0
    mip: fx
    short_name: sftlf
- dataset: HIRHAM5
  driver: CNRM-CERFACS-CNRM-CM5
  ensemble: r1i1p1
  institute: DMI
  rcm_version: v2
  supplementary_variables:
  - mip: fx
    short_name: sftlf
- dataset: HIRHAM5
  driver: ECMWF-ERAINT
  ensemble: r1i1p1
  institute: DMI
  rcm_version: v1
  supplementary_variables:
  - mip: fx
    short_name: sftlf
- dataset: HIRHAM5
  driver: ICHEC-EC-EARTH
  ensemble: r1i1p1
  institute: DMI
  rcm_version: v1
  supplementary_variables: &id002
  - mip: fx
    short_name: sftlf
- dataset: HIRHAM5
  driver: ICHEC-EC-EARTH
  ensemble: r12i1p1
  institute: DMI
  rcm_version: v1
  supplementary_variables: *id002
- dataset: HIRHAM5
  driver: ICHEC-EC-EARTH
  ensemble: r3i1p1
  institute: DMI
  rcm_version: v2
  supplementary_variables:
  - mip: fx
    short_name: sftlf
- dataset: HIRHAM5
  driver: IPSL-IPSL-CM5A-MR
  ensemble: r1i1p1
  institute: DMI
  rcm_version: v1
  supplementary_variables:
  - mip: fx
    short_name: sftlf
- dataset: HIRHAM5
  driver: MOHC-HadGEM2-ES
  ensemble: r1i1p1
  institute: DMI
  rcm_version: v2
  supplementary_variables:
  - mip: fx
    short_name: sftlf
- dataset: HIRHAM5
  driver: MPI-M-MPI-ESM-LR
  ensemble: r1i1p1
  institute: DMI
  rcm_version: v1
  supplementary_variables:
  - mip: fx
    short_name: sftlf
- dataset: HIRHAM5
  driver: NCC-NorESM1-M
  ensemble: r1i1p1
  institute: DMI
  rcm_version: v3
  supplementary_variables:
  - mip: fx
    short_name: sftlf
- dataset: HadREM3-GA7-05
  driver: CNRM-CERFACS-CNRM-CM5
  ensemble: r1i1p1
  institute: MOHC
  rcm_version: v2
  supplementary_variables:
  - mip: fx
    short_name: sftlf
- dataset: HadREM3-GA7-05
  driver: ECMWF-ERAINT
  ensemble: r1i1p1
  institute: MOHC
  rcm_version: v1
  supplementary_variables:
  - mip: fx
    short_name: sftlf
- dataset: HadREM3-GA7-05
  driver: ICHEC-EC-EARTH
  ensemble: r12i1p1
  institute: MOHC
  rcm_version: v1
  supplementary_variables:
  - mip: fx
    short_name: sftlf
- dataset: HadREM3-GA7-05
  driver: MOHC-HadGEM2-ES
  ensemble: r1i1p1
  institute: MOHC
  rcm_version: v1
  supplementary_variables:
  - mip: fx
    short_name: sftlf
- dataset: HadREM3-GA7-05
  driver: MPI-M-MPI-ESM-LR
  ensemble: r1i1p1
  institute: MOHC
  rcm_version: v1
  supplementary_variables:
  - mip: fx
    short_name: sftlf
- dataset: HadREM3-GA7-05
  driver: NCC-NorESM1-M
  ensemble: r1i1p1
  institute: MOHC
  rcm_version: v1
  supplementary_variables:
  - mip: fx
    short_name: sftlf
- dataset: RCA4
  driver: CNRM-CERFACS-CNRM-CM5
  ensemble: r1i1p1
  institute: SMHI
  rcm_version: v1
  supplementary_variables:
  - ensemble: r0i0p0
    mip: fx
    short_name: sftlf
- dataset: RCA4
  driver: ECMWF-ERAINT
  ensemble: r1i1p1
  institute: SMHI
  rcm_version: v1
  supplementary_variables:
  - ensemble: r0i0p0
    mip: fx
    short_name: sftlf
- dataset: RCA4
  driver: ICHEC-EC-EARTH
  ensemble: r1i1p1
  institute: SMHI
  rcm_version: v1
  supplementary_variables: &id003
  - ensemble: r0i0p0
    mip: fx
    short_name: sftlf
- dataset: RCA4
  driver: ICHEC-EC-EARTH
  ensemble: r3i1p1
  institute: SMHI
  rcm_version: v1
  supplementary_variables: *id003
- dataset: RCA4
  driver: ICHEC-EC-EARTH
  ensemble: r12i1p1
  institute: SMHI
  rcm_version: v1
  supplementary_variables: *id003
- dataset: RCA4
  driver: IPSL-IPSL-CM5A-MR
  ensemble: r1i1p1
  institute: SMHI
  rcm_version: v1
  supplementary_variables:
  - ensemble: r0i0p0
    mip: fx
    short_name: sftlf
- dataset: RCA4
  driver: MOHC-HadGEM2-ES
  ensemble: r1i1p1
  institute: SMHI
  rcm_version: v1
  supplementary_variables:
  - ensemble: r0i0p0
    mip: fx
    short_name: sftlf
- dataset: RCA4
  driver: MPI-M-MPI-ESM-LR
  ensemble: r(2:3)i1p1
  institute: SMHI
  rcm_version: v1
  supplementary_variables:
  - ensemble: r0i0p0
    mip: fx
    short_name: sftlf
- dataset: RCA4
  driver: MPI-M-MPI-ESM-LR
  ensemble: r1i1p1
  institute: SMHI
  rcm_version: v1a
  supplementary_variables:
  - ensemble: r0i0p0
    mip: fx
    short_name: sftlf
- dataset: RCA4
  driver: NCC-NorESM1-M
  ensemble: r1i1p1
  institute: SMHI
  rcm_version: v1
  supplementary_variables:
  - ensemble: r0i0p0
    mip: fx
    short_name: sftlf
- dataset: REMO2015
  driver: CNRM-CERFACS-CNRM-CM5
  ensemble: r1i1p1
  institute: GERICS
  rcm_version: v2
  supplementary_variables:
  - ensemble: r0i0p0
    mip: fx
    short_name: sftlf
- dataset: REMO2015
  driver: ICHEC-EC-EARTH
  ensemble: r12i1p1
  institute: GERICS
  rcm_version: v1
  supplementary_variables:
  - ensemble: r0i0p0
    mip: fx
    short_name: sftlf
- dataset: REMO2015
  driver: IPSL-IPSL-CM5A-LR
  ensemble: r1i1p1
  institute: GERICS
  rcm_version: v1
  supplementary_variables:
  - ensemble: r0i0p0
    mip: fx
    short_name: sftlf
- dataset: REMO2015
  driver: IPSL-IPSL-CM5A-MR
  ensemble: r1i1p1
  institute: GERICS
  rcm_version: v1
  supplementary_variables:
  - ensemble: r0i0p0
    mip: fx
    short_name: sftlf
- dataset: REMO2015
  driver: MOHC-HadGEM2-ES
  ensemble: r1i1p1
  institute: GERICS
  rcm_version: v1
  supplementary_variables:
  - ensemble: r0i0p0
    mip: fx
    short_name: sftlf
- dataset: REMO2015
  driver: NCC-NorESM1-M
  ensemble: r1i1p1
  institute: GERICS
  rcm_version: v1
  supplementary_variables:
  - ensemble: r0i0p0
    mip: fx
    short_name: sftlf
- dataset: REMO2015
  driver: NOAA-GFDL-GFDL-ESM2G
  ensemble: r1i1p1
  institute: GERICS
  rcm_version: v1
  supplementary_variables:
  - ensemble: r0i0p0
    mip: fx
    short_name: sftlf
- dataset: RegCM4-2
  driver: ECMWF-ERAINT
  ensemble: r1i1p1
  institute: DHMZ
  rcm_version: v1
  supplementary_variables:
  - mip: fx
    short_name: sftlf
- dataset: RegCM4-6
  driver: CNRM-CERFACS-CNRM-CM5
  ensemble: r1i1p1
  institute: ICTP
  rcm_version: v2
  supplementary_variables:
  - mip: fx
    short_name: sftlf
- dataset: RegCM4-6
  driver: ECMWF-ERAINT
  ensemble: r1i1p1
  institute: ICTP
  rcm_version: v1
  supplementary_variables:
  - mip: fx
    short_name: sftlf
- dataset: RegCM4-6
  driver: ICHEC-EC-EARTH
  ensemble: r12i1p1
  institute: ICTP
  rcm_version: v1
  supplementary_variables:
  - mip: fx
    short_name: sftlf
- dataset: RegCM4-6
  driver: MOHC-HadGEM2-ES
  ensemble: r1i1p1
  institute: ICTP
  rcm_version: v1
  supplementary_variables:
  - mip: fx
    short_name: sftlf
- dataset: RegCM4-6
  driver: MPI-M-MPI-ESM-LR
  ensemble: r1i1p1
  institute: ICTP
  rcm_version: v1
  supplementary_variables:
  - mip: fx
    short_name: sftlf
- dataset: RegCM4-6
  driver: NCC-NorESM1-M
  ensemble: r1i1p1
  institute: ICTP
  rcm_version: v1
  supplementary_variables:
  - mip: fx
    short_name: sftlf
diagnostics:
  diagnostic:
    variables:
      ts:
        domain: EUR-11
        mip: day
        project: CORDEX
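
The `&id001` and `*id001` markers in the result are YAML anchors and aliases. PyYAML emits them when the same Python list or dict object appears more than once in the structure being dumped, which suggests that some datasets here share one `supplementary_variables` list. The sketch below uses plain Python and made-up data to show how such sharing could be detected and removed before dumping; the helper `shared_lists` is hypothetical.

```python
import copy


def shared_lists(datasets, key="supplementary_variables"):
    """Return indices of datasets whose `key` value is the very same
    object (same id(), not merely equal) as in another dataset."""
    seen = {}
    for index, dataset in enumerate(datasets):
        if key in dataset:
            seen.setdefault(id(dataset[key]), []).append(index)
    return [i for indices in seen.values() if len(indices) > 1 for i in indices]


supp = [{"ensemble": "r0i0p0", "mip": "fx", "short_name": "sftlf"}]
datasets = [
    {"dataset": "RCA4", "ensemble": "r1i1p1", "supplementary_variables": supp},
    {"dataset": "RCA4", "ensemble": "r3i1p1", "supplementary_variables": supp},
    {
        "dataset": "RCA4",
        "ensemble": "r12i1p1",
        "supplementary_variables": [
            {"ensemble": "r0i0p0", "mip": "fx", "short_name": "sftlf"},
        ],
    },
]
print(shared_lists(datasets))

# Copy each dataset separately: one deepcopy of the whole list would
# preserve the sharing, because deepcopy memoizes repeated objects.
independent = [copy.deepcopy(dataset) for dataset in datasets]
print(shared_lists(independent))
```

The aliases are valid YAML and load back to the same content, so removing them is only a matter of readability.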

@CLAassistant

CLAassistant commented May 20, 2026


CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
2 out of 3 committers have signed the CLA.

✅ bouweandela
✅ ghossh
❌ sloosvel

@bouweandela bouweandela modified the milestones: v2.15.0, v2.16.0 Jun 10, 2026
@bouweandela

Member

@axel-lauer @hb326 Do you have a suggestion for how many and which variables should be in the portrait plot for the ESO4clima D6.2.1 deliverable? And which observational datasets? We have clt, rsus, sic, and ts in esmvaltool/recipes/examples/recipe_cordex-cmip5_perfmetrics_all.yml so far, is that enough or do you think it would be better to add some more?

@axel-lauer

Contributor

> @axel-lauer @hb326 Do you have a suggestion for how many and which variables should be in the portrait plot for the ESO4clima D6.2.1 deliverable? And which observational datasets? We have clt, rsus, sic, and ts in esmvaltool/recipes/examples/recipe_cordex-cmip5_perfmetrics_all.yml so far, is that enough or do you think it would be better to add some more?

I don't think we promised a specific number of variables for this deliverable. The deliverable is defined as "New performance metrics plots with ESA-CCI data produced with the project-enhanced ESMValTool for CORDEX-CMIP5 models that were not previously supported."

For this reason, I think it would be great to have a few more variables such as (if available) lwp, clivi, tos, sm, rlut, swcre, lwcre, prw, snw, od550aer, toz.

@bouweandela

Member

@axel-lauer Thanks for the list. The following variables are not available in the CORDEX CMOR table:

  • lwcre (derived from unavailable rlutcs)
  • swcre (derived from unavailable rsutcs)
  • od550aer
  • tos
  • sm (derived from unavailable mrsos)
  • toz (derived from unavailable o3 or tro3)

so that leaves the following variables: clivi, clt, lwp, prw, rsus, rlut, sic, snw, ts.
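
The remaining list can be checked with a few set operations: take the suggested variables plus those already in the recipe, then drop the ones missing from the CORDEX CMOR table. The variable names come from the comments above; the set names are only for illustration.

```python
# Variables suggested by @axel-lauer.
suggested = {
    "lwp", "clivi", "tos", "sm", "rlut", "swcre",
    "lwcre", "prw", "snw", "od550aer", "toz",
}
# Variables already in recipe_cordex-cmip5_perfmetrics_all.yml.
already_in_recipe = {"clt", "rsus", "sic", "ts"}
# Variables reported as not available in the CORDEX CMOR table.
unavailable = {"lwcre", "swcre", "od550aer", "tos", "sm", "toz"}

remaining = sorted((suggested | already_in_recipe) - unavailable)
print(remaining)
```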

@bouweandela

bouweandela commented Jun 23, 2026

Member

Wrote a script to generate the complete recipe_perfmetrics_CORDEX-CMIP5.yml:

Details
"""Script to generate a performance metrics recipe for CORDEX-CMIP5 data."""

import copy
import textwrap

import yaml

from esmvalcore.config import CFG
from esmvalcore.dataset import Dataset, datasets_to_recipe


class FlowStyleDict(dict):
    """Mapping that should be rendered in YAML flow style."""


class FlowStyleDumper(yaml.SafeDumper):
    """YAML dumper with selective flow-style mappings."""


def _represent_flow_style_dict(
    dumper: yaml.SafeDumper,
    data: FlowStyleDict,
) -> yaml.nodes.MappingNode:
    return dumper.represent_mapping(
        "tag:yaml.org,2002:map",
        {k: data[k] for k in sorted(data)},
        flow_style=True,
    )


FlowStyleDumper.add_representer(
    FlowStyleDict,
    _represent_flow_style_dict,
)


def main() -> None:
    """Make a recipe with all CORDEX data we can find."""
    # Make sure to configure the data sources before running this script.
    # To configure ESGF as a data source, run:
    # `esmvaltool config copy data-esmvalcore-esgf.yml`
    CFG["search_data"] = "complete"

    # Set up the documentation and preprocessors.
    preprocessor = {
        "custom_order": True,
        "monthly_statistics": {
            "operator": "mean",
        },
        "climate_statistics": {
            "period": "full",
            "operator": "mean",
        },
        "mask_landsea": {
            "mask_out": "sea",
        },
        "regrid": {
            "target_grid": "EUR-11",
            "scheme": "linear",
        },
        "distance_metric": {
            "metric": "weighted_rmse",
        },
        "multi_model_statistics": {
            "span": "overlap",
            "statistics": [
                {"operator": "mean"},
                {"operator": "median"},
            ],
            "exclude": ["reference_dataset"],
        },
    }
    land_preprocessor = copy.deepcopy(preprocessor)
    land_preprocessor["mask_landsea"]["mask_out"] = "sea"
    sea_preprocessor = copy.deepcopy(preprocessor)
    sea_preprocessor["mask_landsea"]["mask_out"] = "land"
    preprocessor.pop("mask_landsea")

    print(
        yaml.safe_dump(
            {
                "documentation": {
                    "title": "CORDEX-CMIP5 performance metrics",
                    "description": textwrap.dedent("""
                        This is an example recipe that computes performance metrics for CORDEX-CMIP5 models.
                        The outcome of this recipe is used for deliverable DX.X of the project ESO4Clima.
                    """).strip(),
                    "authors": ["loosveldt-tomas_saskia", "andela_bouwe"],
                    "maintainer": ["andela_bouwe"],  # TODO: add Supriyo
                },
                "preprocessors": {
                    "default": preprocessor,
                    "land": land_preprocessor,
                    "sea": sea_preprocessor,
                },
            },
            sort_keys=False,
        ),
    )

    # Define the datasets to include in the recipe.
    reference_datasets = {
        "clivi": Dataset(
            short_name="clivi",
            mip="Amon",
            project="OBS6",
            tier=2,
            dataset="ESACCI-CLOUD",
            version="v3.0-AVHRR-AMPM",
            type="sat",
            reference_for_metric=True,
        ),
        "clt": Dataset(
            short_name="clt",
            mip="day",
            project="OBS6",
            tier=2,
            dataset="ESACCI-CLOUD",
            version="v3.0-AVHRR-AMPM",
            type="sat",
            reference_for_metric=True,
        ),
        "lwp": Dataset(
            short_name="lwp",
            mip="Amon",
            project="OBS6",
            tier=2,
            dataset="ESACCI-CLOUD",
            version="v3.0-AVHRR-AMPM",
            type="sat",
            reference_for_metric=True,
            derive=False,
        ),
        "prw": Dataset(
            short_name="prw",
            mip="Eday",
            project="OBS6",
            tier=3,
            dataset="ESACCI-WATERVAPOUR",
            version="CDR2-L3-COMBI-05deg-fv3.1",
            type="sat",
            reference_for_metric=True,
        ),
        "rlut": Dataset(
            short_name="rlut",
            mip="Amon",
            project="OBS6",
            tier=2,
            dataset="ESACCI-CLOUD",
            version="v3.0-AVHRR-AMPM",
            type="sat",
            reference_for_metric=True,
        ),
        "rsus": Dataset(
            short_name="rsus",
            mip="Amon",
            project="OBS6",
            tier=2,
            dataset="ESACCI-CLOUD",
            type="sat",
            version="v3.0-AVHRR-AMPM",
            reference_for_metric=True,
        ),
        "sic": Dataset(
            short_name="sic",
            mip="SIday",
            project="OBS6",
            tier=2,
            dataset="ESACCI-SEAICE",
            version="L4-SICONC-RE-SSMI-12.5kmEASE2-fv3.0-NH",
            type="sat",
            reference_for_metric=True,
        ),
        "snw": Dataset(
            short_name="snw",
            mip="day",
            project="OBS6",
            tier=2,
            dataset="ESACCI-SNOW",
            version="v2.0",
            type="sat",
            reference_for_metric=True,
        ),
        "lst": Dataset(
            short_name="ts",
            mip="Amon",
            project="OBS",
            tier=2,
            dataset="ESACCI-LST",
            version="1.00",
            type="sat",
            reference_for_metric=True,
        ),
        "sst": Dataset(
            short_name="tos",
            mip="Oday",
            project="OBS6",
            tier=2,
            dataset="ESACCI-SST",
            version="3.0-L4-analysis",
            type="sat",
            reference_for_metric=True,
        ),
    }
    reference_datasets["lst"].add_supplementary(
        short_name="sftlf",
        skip=True,
    )
    reference_datasets["sst"].add_supplementary(
        short_name="sftof",
        skip=True,
    )
    for variable_group, dataset in reference_datasets.items():
        dataset.set_facet("variable_group", variable_group, persist=False)

    datasets = list(reference_datasets.values())
    for short_name in [
        "clivi",
        "clt",
        "clwvi",  # "lwp" is derived from "clwvi" and "clivi"
        "prw",
        "rsus",
        "rlut",
        "sic",
        "snw",
        "ts",
    ]:
        template = Dataset(
            project="CORDEX",
            mip="day",
            short_name=short_name,
            domain="EUR-11",
            dataset="*",
            exp="historical",
            institute="*",
            rcm_version="*",
            driver="*",
            ensemble="*",
        )
        for dataset in template.from_files():
            if dataset.facets["short_name"] == "clwvi":
                # "lwp" is derived from "clwvi" and "clivi", but "clwvi" is
                # not required for the recipe.
                dataset.facets["short_name"] = "lwp"
            if dataset.facets["short_name"] == "ts":
                # "ts" needs to be separated into land and sea surface
                # temperature in order to compare with LST and SST reference
                # datasets.
                dataset.add_supplementary(
                    short_name="sftlf",
                    mip="fx",
                    ensemble="*",
                    exp="*",
                    # Not entirely sure if driver and institute should be
                    # wildcards here or if grids differ between drivers and
                    # institutes.
                    driver="*",
                    institute="*",
                )
                result = next(dataset.from_files())
                if result.supplementaries:
                    # Do not include the supplementary in the recipe
                    # explicitly; it will be added automatically.
                    result.supplementaries.clear()
                    lst = result.copy(preprocessor="land")
                    lst.set_facet("variable_group", "lst", persist=False)
                    datasets.append(lst)
                    sst = result.copy(preprocessor="sea")
                    sst.set_facet("variable_group", "sst", persist=False)
                    datasets.append(sst)
            else:
                datasets.append(dataset)

    recipe = datasets_to_recipe(
        [d.copy(diagnostic="diagnostic") for d in datasets],
    )
    # Drop "diagnostic" from supplementary variables.
    # Is this a bug in ESMValCore?
    for variable in recipe["diagnostics"]["diagnostic"]["variables"].values():
        for dataset in variable["additional_datasets"]:
            for supplementary in dataset.get("supplementary_variables", []):
                supplementary.pop("diagnostic", None)

    # Configure reference datasets.
    for variable_group, variable in recipe["diagnostics"]["diagnostic"][
        "variables"
    ].items():
        if variable_group in reference_datasets:
            variable["reference_dataset"] = reference_datasets[
                variable_group
            ].facets["dataset"]

    # Move common facets to the variable level.
    for variable_group, variable in recipe["diagnostics"]["diagnostic"][
        "variables"
    ].items():
        variable["project"] = "CORDEX"
        variable["mip"] = "day"
        variable["domain"] = "EUR-11"
        variable["exp"] = "historical"
        variable["ensemble"] = "r1i1p1"
        variable["rcm_version"] = "v1"
        variable["timerange"] = "2003/2005"
        if variable_group == "lwp":
            variable["derive"] = True
        if variable_group == "lst":
            variable["preprocessor"] = "land"
            variable["short_name"] = "ts"
        elif variable_group == "sst":
            variable["preprocessor"] = "sea"
            variable["short_name"] = "ts"
        else:
            variable["preprocessor"] = "default"
        for dataset in variable["additional_datasets"]:
            for key in list(dataset.keys()):
                if dataset[key] == variable.get(key):
                    dataset.pop(key)

    # TODO: check that moving common datasets to "datasets" section works as
    # expected.
    first_datasets = next(
        iter(recipe["diagnostics"]["diagnostic"]["variables"].values())
    )["additional_datasets"]
    # Iterate over a copy, because common datasets are removed from this list
    # inside the loop.
    for dataset in list(first_datasets):
        if all(
            dataset
            in recipe["diagnostics"]["diagnostic"]["variables"][var][
                "additional_datasets"
            ]
            for var in recipe["diagnostics"]["diagnostic"]["variables"]
        ):
            recipe["datasets"].append(dataset)
            for var in recipe["diagnostics"]["diagnostic"]["variables"]:
                recipe["diagnostics"]["diagnostic"]["variables"][var][
                    "additional_datasets"
                ].remove(dataset)

    # Set up the diagnostic script.
    recipe["diagnostics"]["diagnostic"]["scripts"] = {
        "portrait": {
            "script": "portrait_plot.py",
            "x_by": "alias",
            "y_by": "variable",
            "group_by": "project",
            "normalize": "centered_median",
            "default_split": "Ref1",
            "nan_color": "#bdbdbd",
            "plot_kwargs": {
                "vmin": -0.5,
                "vmax": +0.5,
            },
            "cbar_kwargs": {
                "label": "Relative RMSE",
                "extend": "both",
            },
        },
    }

    # Make the recipe more readable by using flow style for the datasets.
    recipe["datasets"] = [
        FlowStyleDict(dataset) for dataset in recipe["datasets"]
    ]
    for variable in recipe["diagnostics"]["diagnostic"]["variables"].values():
        variable["additional_datasets"] = [
            FlowStyleDict(dataset)
            for dataset in variable["additional_datasets"]
        ]

    print(yaml.dump(recipe, width=200, Dumper=FlowStyleDumper), end="")


if __name__ == "__main__":
    main()
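
The step that moves datasets shared by every variable into the top-level `datasets` section can be checked in isolation. Below is a minimal sketch, assuming plain dicts in place of a real recipe; the helper name `hoist_common_datasets` and the sample facets are hypothetical and are not part of ESMValCore or of the script above. It loops over a copy of the first list, since removing items from a list while iterating over it skips elements.

```python
def hoist_common_datasets(variables):
    """Move datasets present in every variable's list to a shared list.

    `variables` maps a variable group name to a dict holding an
    "additional_datasets" list of facet dicts. The lists are modified
    in place and the shared datasets are returned.
    """
    if not variables:
        return []
    lists = [v["additional_datasets"] for v in variables.values()]
    common = []
    # Iterate over a copy: the loop body removes items from lists[0].
    for dataset in list(lists[0]):
        if all(dataset in datasets for datasets in lists):
            common.append(dataset)
            for datasets in lists:
                datasets.remove(dataset)
    return common


# Hypothetical sample data: "A" and "B" are shared, "C" is not.
variables = {
    "clt": {
        "additional_datasets": [
            {"dataset": "A"},
            {"dataset": "B"},
            {"dataset": "C"},
        ],
    },
    "prw": {
        "additional_datasets": [
            {"dataset": "A"},
            {"dataset": "B"},
        ],
    },
}
common = hoist_common_datasets(variables)
print(common)
print(variables)
```

With this sample input, both shared entries end up in `common` and only `{"dataset": "C"}` stays under `clt`. Iterating over the original list instead of a copy would leave `{"dataset": "B"}` behind, because the removal of `{"dataset": "A"}` shifts the remaining items.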

Projects

Status: In Progress

5 participants